Zhenghao Zhang

PaperX: A Unified Framework for Multimodal Academic Presentation Generation with Scholar DAG

Feb 05, 2026

ShotFinder: Imagination-Driven Open-Domain Video Shot Retrieval via Web Search

Jan 30, 2026

SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving

Dec 11, 2025

Dynamic Deep Graph Learning for Incomplete Multi-View Clustering with Masked Graph Reconstruction Loss

Nov 14, 2025

Identity-GRPO: Optimizing Multi-Human Identity-preserving Video Generation via Reinforcement Learning

Oct 16, 2025

Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Jul 31, 2024

BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight

Jul 11, 2024

MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps

Jul 11, 2024

EffiVED: Efficient Video Editing via Text-instruction Diffusion Models

Mar 18, 2024

Understanding Long Range-Frequency Hopping Spread Spectrum (LR-FHSS) with Real-World Packet Traces

Dec 21, 2023